AITopics | neural network layer

Collaborating Authors

neural network layer

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Kantorovich--Kernel Neural Operators: Approximation Theory, Asymptotics, and Neural Network Interpretation

He, Tian-Xiao

arXiv.org Machine LearningMar-30-2026

This paper studies a class of multivariate Kantorovich-kernel neural network operators, including the deep Kantorovich-type neural network operators studied by Sharma and Singh. We prove density results, establish quantitative convergence estimates, derive Voronovskaya-type theorems, analyze the limits of partial differential equations for deep composite operators, prove Korovkin-type theorems, and propose inversion theorems. This paper studies a class of multivariate Kantorovich-kernel neural network operators, including the deep Kantorovich-type neural network operators studied by Sharma and Singh. We prove density results, establish quantitative convergence estimates, derive Voronovskaya-type theorems, analyze the limits of partial differential equations for deep composite operators, prove Korovkin-type theorems, and propose inversion theorems. Furthermore, this paper discusses the connection between neural network architectures and the classical positive operators proposed by Chui, Hsu, He, Lorentz, and Korovkin.

artificial intelligence, machine learning, operator, (16 more...)

arXiv.org Machine Learning

2603.26418

Country:

North America > United States > Illinois (0.40)
North America > United States > New York (0.05)
North America > United States > Rhode Island > Providence County > Providence (0.04)

Genre: Research Report (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)

Add feedback

Hierarchical Lattice Layer for Partially Monotone Neural Networks

Neural Information Processing SystemsDec-24-2025, 03:29:28 GMT

Partially monotone regression is a regression analysis in which the target values are monotonically increasing with respect to a subset of input features. The TensorFlow Lattice library is one of the standard machine learning libraries for partially monotone regression. It consists of several neural network layers, and its core component is the lattice layer. One of the problems of the lattice layer is that it requires the projected gradient descent algorithm with many constraints to train it. Another problem is that it cannot receive a high-dimensional input vector due to the memory consumption. We propose a novel neural network layer, the hierarchical lattice layer (HLL), as an extension of the lattice layer so that we can use a standard stochastic gradient descent algorithm to train HLL while satisfying monotonicity constraints and so that it can receive a high-dimensional input vector. Our experiments demonstrate that HLL did not sacrifice its prediction performance on real datasets compared with the lattice layer.

hierarchical lattice layer, monotone neural network, name change, (6 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.84)

Add feedback

Hierarchical Lattice Layer for Partially Monotone Neural Networks

Neural Information Processing SystemsOct-10-2024, 21:59:04 GMT

hierarchical lattice layer, lattice layer, monotone neural network, (6 more...)

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.91)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.90)

Add feedback

Reviews: Hyperbolic Neural Networks

Neural Information Processing SystemsOct-8-2024, 06:23:41 GMT

Thanks to the authors for the detailed response. The new results presented in the rebuttal are indeed convincing, hence I am updating my score to an 8 now. This is with the understanding that these would be incorporated in the revised version of the paper. Several works in the last year have explored using hyperbolic representations for data which exhibits hierarchical latent structure. Some promising results on the efficiency of these representations at capturing hierarchical relationships have been shown, most notably by Nickel & Kiela (Nips, 2017). However one big hindrance for utilizing them so far is the lack of deep neural network models which can consume these representations as input for some other downstream task.

hyperbolic neural network, neural network layer, representation, (5 more...)

Neural Information Processing Systems

Genre: Research Report > New Finding (0.37)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.37)

Add feedback

Learning from Demonstration with Implicit Nonlinear Dynamics Models

Fagan, Peter David, Ramamoorthy, Subramanian

arXiv.org Artificial IntelligenceOct-1-2024

Learning from Demonstration (LfD) is a useful paradigm for training policies that solve tasks involving complex motions, such as those encountered in robotic manipulation. In practice, the successful application of LfD requires overcoming error accumulation during policy execution, i.e. the problem of drift due to errors compounding over time and the consequent out-of-distribution behaviours. Existing works seek to address this problem through scaling data collection, correcting policy errors with a human-in-the-loop, temporally ensembling policy predictions or through learning a dynamical system model with convergence guarantees. In this work, we propose and validate an alternative approach to overcoming this issue. Inspired by reservoir computing, we develop a recurrent neural network layer that includes a fixed nonlinear dynamical system with tunable dynamical properties for modelling temporal dynamics. We validate the efficacy of our neural network layer on the task of reproducing human handwriting motions using the LASA Human Handwriting Dataset. Through empirical experiments we demonstrate that incorporating our layer into existing neural network architectures addresses the issue of compounding errors in LfD. Furthermore, we perform a comparative evaluation against existing approaches including a temporal ensemble of policy predictions and an Echo State Network (ESN) implementation. We find that our approach yields greater policy precision and robustness on the handwriting task while also generalising to multiple dynamics regimes and maintaining competitive latency scores.

artificial intelligence, deep learning, machine learning, (15 more...)

arXiv.org Artificial Intelligence

2409.18768

Country:

Europe > United Kingdom (0.14)
Asia > China (0.14)

Genre: Research Report (0.50)

Industry: Energy > Oil & Gas > Upstream (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.67)

Add feedback

Hidden Classification Layers: Enhancing linear separability between classes in neural networks layers

Apicella, Andrea, Isgrò, Francesco, Prevete, Roberto

arXiv.org Artificial IntelligenceNov-18-2023

In the context of classification problems, Deep Learning (DL) approaches represent state of art. Many DL approaches are based on variations of standard multi-layer feed-forward neural networks. These are also referred to as deep networks. The basic idea is that each hidden neural layer accomplishes a data transformation which is expected to make the data representation "somewhat more linearly separable" than the previous one to obtain a final data representation which is as linearly separable as possible. However, determining the appropriate neural network parameters that can perform these transformations is a critical problem. In this paper, we investigate the impact on deep network classifier performances of a training approach favouring solutions where data representations at the hidden layers have a higher degree of linear separability between the classes with respect to standard methods. To this aim, we propose a neural network architecture which induces an error function involving the outputs of all the network layers. Although similar approaches have already been partially discussed in the past literature, here we propose a new architecture with a novel error function and an extensive experimental analysis. This experimental analysis was made in the context of image classification tasks considering four widely used datasets. The results show that our approach improves the accuracy on the test set in all the considered cases.

architecture, data representation, representation, (15 more...)

arXiv.org Artificial Intelligence

doi: 10.1016/j.patrec.2023.11.016

2306.06146

Country: Europe > Italy > Campania > Naples (0.04)

Genre: Research Report > New Finding (0.48)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.49)

Add feedback

A New Computationally Simple Approach for Implementing Neural Networks with Output Hard Constraints

Konstantinov, Andrei V., Utkin, Lev V.

arXiv.org Artificial IntelligenceJul-19-2023

A new computationally simple method of imposing hard convex constraints on the neural network output values is proposed. The key idea behind the method is to map a vector of hidden parameters of the network to a point that is guaranteed to be inside the feasible set defined by a set of constraints. The mapping is implemented by the additional neural network layer with constraints for output. The proposed method is simply extended to the case when constraints are imposed not only on the output vectors, but also on joint constraints depending on inputs. The projection approach to imposing constraints on outputs can simply be implemented in the framework of the proposed method. It is shown how to incorporate different types of constraints into the proposed method, including linear and quadratic constraints, equality constraints, and dynamic constraints, constraints in the form of boundaries. An important feature of the method is its computational simplicity. Complexities of the forward pass of the proposed neural network layer by linear and quadratic constraints are O(n*m) and O(n^2*m), respectively, where n is the number of variables, m is the number of constraints. Numerical experiments illustrate the method by solving optimization and classification problems. The code implementing the method is publicly available.

artificial intelligence, constraint, machine learning, (19 more...)

arXiv.org Artificial Intelligence

2307.10459

Country:

Asia > Russia (0.14)
Europe > Russia > Northwestern Federal District > Leningrad Oblast > Saint Petersburg (0.04)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)

Add feedback

People detection and social distancing classification in smart cities for COVID-19 by using thermal images and deep learning algorithms

Elhanashi, Abdussalam, Saponara, Sergio, Gagliardi, Alessio

arXiv.org Artificial IntelligenceSep-10-2022

COVID-19 is a disease caused by severe respiratory syndrome coronavirus. It was identified in December 2019 in Wuhan, China. It has resulted in an ongoing pandemic that caused infected cases including some deaths. Coronavirus is primarily spread between people during close contact. Motivating to this notion, this research proposes an artificial intelligence system for social distancing classification of persons by using thermal images. By exploiting YOLOv2 (you look at once), a deep learning detection technique is developed for detecting and tracking people in indoor and outdoor scenarios. An algorithm is also implemented for measuring and classifying the distance between persons and automatically check if social distancing rules are respected or not. Hence, this work aims at minimizing the spread of the COVID-19 virus by evaluating if and how persons comply with social distancing rules. The proposed approach is applied to images acquired through thermal cameras, to establish a complete AI system for people tracking, social distancing classification, and body temperature monitoring. The training phase is done with two datasets captured from different thermal cameras. Ground Truth Labeler app is used for labeling the persons in the images. The achieved results show that the proposed method is suitable for the creation of a smart surveillance system in smart cities for people detection, social distancing classification, and body temperature analysis.

artificial intelligence, machine learning, social distancing classification, (14 more...)

arXiv.org Artificial Intelligence

2209.04704

Country: Asia > China > Hubei Province > Wuhan (0.25)

Genre: Research Report > New Finding (0.35)

Industry:

Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (1.00)
Health & Medicine > Therapeutic Area > Immunology (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Neural network layers as parametric spans

Bergomi, Mattia G., Vertechi, Pietro

arXiv.org Artificial IntelligenceSep-3-2022

Properties such as composability and automatic differentiation made artificial neural networks a pervasive tool in applications. Tackling more challenging problems caused neural networks to progressively become more complex and thus difficult to define from a mathematical perspective. We present a general definition of linear layer arising from a categorical framework based on the notions of integration theory and parametric spans. This definition generalizes and encompasses classical layers (e.g., dense, convolutional), while guaranteeing existence and computability of the layer's derivatives for backpropagation.

integration theory, neural network, parametric span, (13 more...)

arXiv.org Artificial Intelligence

2208.00809

Country: Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report (0.40)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.32)

Add feedback

Groovy: Detecting objects with Groovy, the Deep Java Library (DJL), and Apache MXNet

#artificialintelligenceAug-1-2022, 11:52:26 GMT

This blog posts looks at using Apache Groovy with the Deep Java Library (DJL) and backed by the Apache MXNet engine to detect objects within an image. Deep learning falls under the branches of machine learning and artificial intelligence. It involves multiple layers (hence the "deep") of an artificial neural network. There are lots of ways to configure such networks and the details are beyond the scope of this blog post, but we can give some basic details. We will have four input nodes corresponding to the measurements of our four characteristics.

apache mxnet, application, groovy, (12 more...)

#artificialintelligence

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.72)

Add feedback